Bulk Scheduling with DIANA Scheduler

Authors

  • Ashiq Anjum
  • Richard McClatchey
  • Arshad Ali
  • Ian Willers
Abstract

This paper describes results from, and progress on, the development of a Data Intensive and Network Aware (DIANA) Scheduling engine, intended primarily for data intensive sciences such as physics analysis. Scientific analysis tasks can involve thousands of computing, data handling, and network resources, and the size of the input and output files, together with the overall storage space allocated to a user, can have a significant bearing on the scheduling of data intensive applications. If the input or output files must be retrieved from a remote location, then the time required to transfer the files must also be taken into consideration when scheduling compute resources for the given application. The central problem in this study is the coordinated management of computation and data at multiple locations, not simply data movement. However, this can be a very costly operation, and efficient scheduling can be a challenge if compute and data resources are mapped without considering network cost. This can result in performance degradation, particularly if the scheduling engine takes no advantage of recent advances in networking technologies and bandwidth abundance. We have implemented an adaptive algorithm within the DIANA Scheduler which takes into account data location and size, network performance, and computation capability to make efficient global scheduling decisions. DIANA is a performance-aware as well as an economy-guided Meta Scheduler. It iteratively allocates each job to the site that is likely to produce the best performance, while also optimizing the global queue for any remaining pending jobs. It is therefore equally suitable whether a single job is being submitted or bulk scheduling is being performed. Results suggest that considerable performance improvements are to be gained by adopting the DIANA scheduling approach.
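The idea of weighing data location, network performance, and computation capability when placing each job can be illustrated with a minimal sketch. This is not the DIANA algorithm itself, whose cost model the abstract does not specify; it is a simple greedy placement under assumed names (`Site`, `Job`, `estimated_cost`, `schedule`) and an assumed cost of queue time plus transfer time plus compute time.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    cpu_rate: float              # work units processed per second (assumed metric)
    bandwidth: float             # MB/s for pulling remote data to this site (assumed)
    queued_seconds: float = 0.0  # time until this site is free again


@dataclass
class Job:
    name: str
    work: float       # compute work units
    input_mb: float   # size of the input data in MB
    data_site: str    # name of the site that already holds the input


def estimated_cost(job: Job, site: Site) -> float:
    """Hypothetical cost: waiting time + data transfer time + compute time."""
    # No transfer is needed when the job runs where its data already lives.
    transfer = 0.0 if job.data_site == site.name else job.input_mb / site.bandwidth
    compute = job.work / site.cpu_rate
    return site.queued_seconds + transfer + compute


def schedule(jobs: list[Job], sites: list[Site]) -> dict[str, str]:
    """Greedily place each job on the site with the lowest estimated cost."""
    plan: dict[str, str] = {}
    for job in jobs:
        best = min(sites, key=lambda s: estimated_cost(job, s))
        # The chosen site stays busy until this job's estimated completion,
        # so later jobs in the bulk submission see the updated queue.
        best.queued_seconds = estimated_cost(job, best)
        plan[job.name] = best.name
    return plan
```

With one job the loop reduces to a single best-site choice; with many jobs the updated `queued_seconds` spreads the load, which loosely mirrors the claim that the same approach serves both single and bulk submission. A real meta scheduler would also reorder the pending queue, which this sketch does not attempt.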

Similar resources

Scheduling in Data Intensive and Network Aware (DIANA) Grid Environments

In Grids, scheduling decisions are often made on the basis of jobs being either data or computation intensive: in data intensive situations jobs may be pushed to the data, and in computation intensive situations data may be pulled to the jobs. This kind of scheduling, in which there is no consideration of network characteristics, can lead to performance degradation in a Grid environment and may r...

Starvation Free Scheduler for Buffered Crossbar Switches (RESEARCH NOTE)

The need for high-speed internet connectivity has led to substantial research in switching systems. Buffered crossbar switches have received a lot of attention from both research and industrial communities due to their flexibility and scalability. Designing a scheduling algorithm for buffered crossbar switches without starvation remains a major challenge. In this paper, we proposed a Delay ba...

Hawk: Hybrid Datacenter Scheduling

This paper addresses the problem of efficient scheduling of large clusters under high load and heterogeneous workloads. A heterogeneous workload typically consists of many short jobs and a small number of large jobs that consume the bulk of the cluster’s resources. Recent work advocates distributed scheduling to overcome the limitations of centralized schedulers for large clusters with many comp...

Locality Aware Work-Stealing based Scheduling in Hybrid CPU-GPU Clusters

We study work-stealing based scheduling on a cluster of nodes with CPUs and GPUs. In particular, we evaluate locality aware scheduling in the context of distributed shared memory style programming, where the user is oblivious to data placement. Our runtime maintains a distributed map of data resident on various nodes and uses it to estimate the affinity of work to different nodes to guide sched...

Vassal: Loadable Scheduler Support for Multi-Policy Scheduling

This paper presents Vassal, a system that enables applications to dynamically load and unload CPU scheduling policies into the operating system kernel, allowing multiple policies to be in effect simultaneously. With Vassal, applications can utilize scheduling algorithms tailored to their specific needs, and general-purpose operating systems can support a wide variety of special-purpose scheduling...

Journal title:
  • CoRR

Volume abs/cs/0602026

Publication date: 2006